Scalable Consistency in T-Coffee Through Apache Spark and Cassandra Database
Similar resources
Understanding the Causes of Consistency Anomalies in Apache Cassandra
A recent paper on benchmarking eventual consistency showed that when a constant workload is applied against Cassandra, the staleness of values returned by read operations exhibits interesting but unexplained variations when plotted against time. In this paper we reproduce this phenomenon and investigate in greater depth the low-level mechanisms that give rise to stale reads. We show that the st...
Searchable Encryption in Apache Cassandra
In today’s cloud computing applications it is common practice for clients to outsource their data to cloud storage providers. That data may contain sensitive information, which the client wishes to protect against this untrustworthy environment. Confidentiality can be preserved by the use of encryption. Unfortunately that makes it difficult to perform efficient searches. There are a couple of d...
Scalable SDE Filtering and Inference with Apache Spark
In this paper, we consider the problem of Bayesian filtering and inference for time series data modeled as noisy, discrete-time observations of a stochastic differential equation (SDE) with undetermined parameters. We develop a Metropolis algorithm to sample from the high-dimensional joint posterior density of all SDE parameters and state time series. Our approach relies on an innovative densit...
Optimal adaption for Apache Cassandra
Apache Cassandra is a NoSQL database offering high scalability and availability. Along with its competitors, e.g. HBase, SimpleDB and Bigtable, Cassandra is a widely used platform for big data systems. Tuning the performance of those systems is a complex task and there is a growing demand for autonomic management solutions. In this paper we present an energy-aware adaptation model built from a ...
Approximate Stream Analytics in Apache Flink and Apache Spark Streaming
Approximate computing aims for efficient execution of workflows where an approximate output is sufficient instead of the exact output. The idea behind approximate computing is to compute over a representative sample instead of the entire input dataset. Thus, approximate computing — based on the chosen sample size — can make a systematic trade-off between the output accuracy and computation effi...
Journal
Journal title: Journal of Computational Biology
Year: 2018
ISSN: 1557-8666
DOI: 10.1089/cmb.2018.0084